Abstract

Transform coding is routinely used for lossy compression of discrete sources with memory. The 
input signal is divided into N-dimensional vectors, which are transformed by means of a linear mapping.
Then, transform coefficients are quantized and entropy coded. In this paper we consider the problem of 
identifying the transform matrix as well as the quantization step sizes. We study the challenging case in 
which the only available information is a set of P transform decoded vectors. We formulate the problem 
in terms of finding the lattice with the largest determinant that contains all observed vectors. We propose 
an algorithm that is able to find the optimal solution and we formally study its convergence properties. 
Our analysis shows that it is possible to identify successfully both the transform and the quantization step 
sizes when $P > N + \delta$, where $\delta$ is a small integer, and the probability of failure decreases exponentially
to zero as $P - N$ increases.

I. Introduction 

Transform coding has emerged over the years as the dominating compression strategy. Transform coding 
is adopted in virtually all multimedia compression standards including image compression standards such 
as JPEG and JPEG 2000, and video compression standards such as, for example, H.264/AVC
and HEVC. This is due to the fact that transform coders are very effective and yet computationally
inexpensive since the encoding operation is divided into three relatively simple steps: the computation
of a linear transformation of the data, scalar quantization of each coefficient, and entropy coding.

Transform coding has been widely studied in the last decades and many important results and optimality
conditions have been derived. For example, it is well known that, for Gaussian sources, the
Karhunen-Loève Transform (KLT) is the optimal transform [6][7]. Moreover, the analysis of popular transform
coders used in image compression has also led to new insights and new interesting connections between
compression and non-linear approximation theory [8]. In particular, this analysis has also clarified why
the wavelet transform is the best transform for compressing piecewise regular functions [9][10]. Further
insights on the interplay between linear transform, quantization and entropy coding can be found in [11]
for the case of integer-to-integer transforms.

Due to its centrality to any type of multimedia data, transform coding theory is now extensively 
used in a new range of applications that rely on the possibility of reverse-engineering complex chains 
of operators starting from the available output signals. Indeed, the lifespan of a multimedia signal is 
virtually unbounded. This is due to the ability of creating copies and the availability of inexpensive 
storage options. However, signals seldom remain identical to their original version. As they pass through 
processing chains, some operators, including transform coding, are bound to leave subtle characteristic 
footprints on the signals, which can be identified in order to uncover their processing history. This insight 
might be extremely useful in a wide range of scenarios in the field of multimedia signal processing at 
large including, e.g.: i) forensics, in order to address tasks such as source device identification [12] or
tampering detection [13][14]; ii) quality assessment, to enable no-reference methods that rely solely on
the received signals [15][16]; iii) digital restoration, which requires prior knowledge about the operations
that affected a digital signal [17].

In this context, several works have exploited the footprints left by transform coding. In [18], a method 
was proposed to infer the implementation-dependent quantization matrix template used in a JPEG- 
compressed image. Double JPEG compression introduces characteristic peaks in the histogram of DCT
coefficients, which can be detected and used, e.g., for tampering localization [19][14]. More recently,
similar techniques were applied to video signals for the cases of MPEG-2, MPEG-4
and H.264/AVC [24].

All the aforementioned works require prior knowledge of the type of standard being considered. This 
implies that the specific transform in use is assumed to be known, whereas the quantization step sizes 
need to be estimated. In practice, it might be useful to be able to infer which transform was used in 
order to understand, for example, whether an image was compressed using the DCT-based JPEG or the 
wavelet-based JPEG 2000 and, in the latter case, which wavelet transform was used. Similarly, it would
be good to be able to infer if a video sequence was compressed using MPEG-2, MPEG-4 or H.264/AVC.
Some efforts in this direction can be found in [25].

Most of the above methods focus only on a specific type of multimedia signal (e.g., only images or 
only videos) and are to some extent heuristic. It is therefore natural to try and develop a universal theory 
of transform coder identification that is independent of the specific application at hand. To this end, 
in this paper we consider a general model of transform coding that can be tailored to describe a large 
variety of practical implementations that are found in lossy coding systems, including those adopted 
in multimedia communication. Specifically, a 1-dimensional input signal is encoded by partitioning it
into non-overlapping N-dimensional vectors, which are then transformed by means of a linear mapping.
Then, transform coefficients are quantized and entropy coded. At the decoder, quantization symbols are 
entropy decoded and mapped to reconstruction levels. Then, the inverse transform is applied to obtain 
an approximation of the signal in its original domain. 

Given the output produced by a specific transform coding chain, we investigate the problem of 
identifying its parameters. To this end, we assume both the size and the alignment of the transform 
to be known, as they can be estimated with methods available in the literature. We propose an
algorithm that receives as input a set of P transform decoded vectors embedded in an N-dimensional
vector space and produces as output an estimate of the transform adopted, as well as the quantization
step sizes, whenever these can be unambiguously determined. We leverage the intrinsic discrete nature
of the problem, by observing that these vectors are bound to belong to an N-dimensional lattice.
Hence, the problem is formulated in terms of finding a lattice that contains all observed vectors. We 
propose an algorithm that is able to solve the problem and we formally study its convergence properties. 
Our analysis shows that it is possible to successfully identify both the transform and the quantization step 
sizes with high probability when P > N. In the experiments we found that an excess of approximately 
6-7 observed vectors beyond the dimension N of the space is generally sufficient to ensure successful 
convergence. In addition, the complexity of the algorithm is shown to grow linearly with N. 

It is important to mention that the method used to solve the problem addressed in this paper is related 
to Euclid's algorithm, which is used to find the greatest common divisor (GCD) in a set of integers. 
Indeed, when N = 1 and P = 2, the proposed method coincides with Euclid's algorithm. However, in 
this case the problem reduces to estimating the quantization step size, as the transform is trivially defined. 
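
To make the scalar case concrete, the following is a minimal sketch of Euclid's algorithm applied to two real-valued reconstruction levels. It uses the nearest-integer remainder, which mirrors the rounding used later in the paper; the function name `estimate_step` and the tolerance `tol` are illustrative assumptions, not part of the original method.

```python
def estimate_step(x1, x2, tol=1e-9):
    """Estimate the largest step g such that x1 and x2 are integer
    multiples of g (Euclid's algorithm with nearest-integer remainders)."""
    a, b = abs(x1), abs(x2)
    if a < b:
        a, b = b, a
    while b > tol:
        # remainder of a with respect to the closest multiple of b
        a, b = b, abs(a - b * round(a / b))
    return a

# Two levels produced with step 0.25 and coprime indices 12 and 17.
step_ok = estimate_step(12 * 0.25, 17 * 0.25)
# Indices 12 and 18 share the factor 6, so only a multiple of the step is found.
step_coarse = estimate_step(12 * 0.25, 18 * 0.25)
```

In the second call the estimate is six times the true step size, which is the one-dimensional analogue of converging to a sub-lattice of the true lattice.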

Note that lattice theory has been widely used for source and channel coding (e.g., [26], [27], [28]).
However, to the best of the authors' knowledge, this theory has not been employed to address the
problem of identifying a linear mapping using the footprint left by quantization. Only [29] uses similar
principles but their goal is to investigate the color compression history, i.e., the colorspace used in JPEG
compression. Therefore, the solution proposed is tailored to work in a 3-dimensional vector space, thus
avoiding the challenges that arise in higher dimensional spaces.

Also, it is important not to confuse the problem addressed in this paper with the classical problem of 
lattice reduction [28]. In the latter case, given a basis for a lattice, one seeks an equivalent basis matrix
with favorable properties. Usually, such a basis consists of vectors that are short and with improved
orthogonality. There are several definitions of lattice reduction with corresponding reduction criteria,
each meeting a different tradeoff between quality of the reduced basis and the computational effort
required for finding it. The most popular one is the Lenstra-Lenstra-Lovász (LLL) reduction [30], which
can be interpreted as an extension of the Gauss reduction to lattices of rank greater than 2. 

The rest of this paper is organized as follows. Section II introduces the necessary notation and
formulates the transform identification problem, and Section III provides the background on lattice
theory. The proposed method is described in Section IV. Then, a theoretical analysis of the convergence
properties is presented in Section V. The performance of the transform identification algorithm is evaluated
empirically in Section VI. Finally, Section VII concludes the paper, indicating the open issues and
stimulating further investigations.



II. Problem statement 

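The symbols $x$, $\mathbf{x}$ and $\mathbf{X}$ denote, respectively, a scalar, a column vector and a matrix. An $M \times N$ matrix
$\mathbf{X}$ can be written either in terms of its columns or rows. Specifically,

$$\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N] = \begin{bmatrix} \bar{\mathbf{x}}_1^T \\ \bar{\mathbf{x}}_2^T \\ \vdots \\ \bar{\mathbf{x}}_M^T \end{bmatrix}, \tag{1}$$

where $\mathbf{x}_j$ denotes the $j$-th column and $\bar{\mathbf{x}}_i^T$ the $i$-th row of $\mathbf{X}$.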



Let $\mathbf{x}$ denote an N-dimensional vector and $\mathbf{W}$ a transform matrix, whose rows represent the transform
basis functions.

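Transform coding is performed by applying scalar quantization to the transform coefficients $\mathbf{y} = \mathbf{W}\mathbf{x}$.
Let $Q_i(\cdot)$ denote the quantizer associated to the $i$-th transform coefficient. We assume that $Q_i(\cdot)$ is a scalar
uniform quantizer with step size $\Delta_i$, $i = 1, \ldots, N$. Therefore, the reconstructed quantized coefficients
can be written as $\hat{\mathbf{y}} = [\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N]^T$, with

$$\hat{y}_i = Q_i(y_i) = \Delta_i \cdot \mathrm{round}\!\left(\frac{y_i}{\Delta_i}\right), \quad i = 1, \ldots, N. \tag{2}$$

The reconstructed block in the original domain is given by $\hat{\mathbf{x}} = \mathbf{W}^{-1}\hat{\mathbf{y}}$.

Let $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$ denote a set of $P$ observed N-dimensional vectors, which are the output of a
transform coder (from here on, the hat is dropped from decoded vectors for brevity). Due to quantization,
the unobserved vectors representing quantized transform coefficients $\{\mathbf{y}_1, \ldots, \mathbf{y}_P\}$ are constrained
to belong to a lattice $\mathcal{L}_y$ described by the following basis:

$$\mathbf{B}_y = \begin{bmatrix} \Delta_1 & 0 & \cdots & 0 \\ 0 & \Delta_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Delta_N \end{bmatrix}. \tag{3}$$

Therefore, the observed vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$ belong to a lattice $\mathcal{L}_x$ described by the basis:

$$\mathbf{B}_x = [\mathbf{b}_{x,1}, \ldots, \mathbf{b}_{x,N}] = \mathbf{W}^{-1}\mathbf{B}_y, \tag{4}$$

with $\mathbf{b}_{x,i} = \Delta_i \tilde{\mathbf{w}}_i$, $i = 1, \ldots, N$, where $\mathbf{W}^{-1} = [\tilde{\mathbf{w}}_1, \ldots, \tilde{\mathbf{w}}_N]$.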
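
The coding model and the lattice basis in (3) and (4) can be checked numerically. The sketch below simulates a transform coder and verifies that every decoded vector has integer coordinates in the basis $\mathbf{B}_x$. The random orthonormal transform, the step sizes and the source statistics are illustrative assumptions, as is the helper name `transform_code`.

```python
import numpy as np

def transform_code(x, W, delta):
    """Encode and decode one block: y = W x, uniform quantization, inverse transform."""
    y = W @ x
    y_hat = delta * np.round(y / delta)   # eq. (2)
    return np.linalg.solve(W, y_hat)      # x_hat = W^{-1} y_hat

rng = np.random.default_rng(0)
N, P = 4, 10
# Hypothetical transform: a random orthonormal matrix obtained by QR factorization.
W, _ = np.linalg.qr(rng.standard_normal((N, N)))
delta = np.array([1.0, 2.0, 0.5, 4.0])    # assumed step sizes
X = rng.normal(scale=10.0, size=(N, P))   # unobserved source blocks
X_hat = np.column_stack([transform_code(X[:, k], W, delta) for k in range(P)])

B_x = np.linalg.inv(W) @ np.diag(delta)   # eq. (4)
coords = np.linalg.solve(B_x, X_hat)      # coordinates of decoded vectors in B_x
is_lattice = np.allclose(coords, np.round(coords))
```

Here `is_lattice` evaluates to true up to floating-point error, since the coordinates are exactly the quantization indices.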



In this paper we study the problem of determining $\mathbf{B}_x$ from a finite set of $P > N$ distinct vectors
$\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$. That is, we seek to determine the parameters of a transform coder based on the footprints
left on its output. We propose an algorithm to solve this problem and we study its convergence properties.
In addition, we show that the probability of correctly determining $\mathbf{B}_x$ (or, equivalently, another basis for
the lattice $\mathcal{L}_x$) is monotonically increasing in the number of observations $P$, and rapidly approaches
one when $P > N$. Note that when determining $\mathbf{B}_x$, the proposed method does not make any assumption
on the structure of the transform matrix $\mathbf{W}$. In the general case, given $\mathbf{B}_x$, it is not possible to uniquely
determine $\mathbf{W}$ and the quantization step sizes $\Delta_i$, $i = 1, \ldots, N$. Indeed, the length of each basis vector
$\mathbf{b}_{x,i}$ can be factored as $\|\mathbf{b}_{x,i}\|_2 = \Delta_i \|\tilde{\mathbf{w}}_i\|_2$, where $\tilde{\mathbf{w}}_i$ is the $i$-th column of $\mathbf{W}^{-1}$, so only the
product of the two factors is observable. However, in the important case in which $\mathbf{W}$ represents
an orthonormal transform, the quantization step sizes $\Delta_i$, $i = 1, \ldots, N$, and the transform matrix $\mathbf{W}$
can be immediately obtained from $\mathbf{B}_x$. Indeed, $\mathbf{W}^{-1} = \mathbf{W}^T$, so that $\tilde{\mathbf{w}}_i = \bar{\mathbf{w}}_i$, $i = 1, \ldots, N$, with
$\|\bar{\mathbf{w}}_i\|_2 = 1$, where $\bar{\mathbf{w}}_i^T$ denotes the $i$-th row of $\mathbf{W}$. Therefore:

$$\Delta_i = \|\mathbf{b}_{x,i}\|_2, \tag{5}$$

$$\bar{\mathbf{w}}_i = \frac{\mathbf{b}_{x,i}}{\|\mathbf{b}_{x,i}\|_2}. \tag{6}$$
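
For the orthonormal case, (5) and (6) amount to a column-wise normalization. The following sketch illustrates this on a 2-dimensional rotation; the helper name `steps_and_transform` and the numerical values are hypothetical, and the sketch assumes that the basis handed to it is exactly $\mathbf{B}_x$ rather than another basis of the same lattice.

```python
import numpy as np

def steps_and_transform(B_x):
    """Recover step sizes (eq. 5) and the rows of an orthonormal W (eq. 6)."""
    delta = np.linalg.norm(B_x, axis=0)   # column norms of B_x
    W = (B_x / delta).T                   # normalized columns become the rows of W
    return delta, W

angle = np.pi / 6
W_true = np.array([[np.cos(angle), np.sin(angle)],
                   [-np.sin(angle), np.cos(angle)]])   # assumed orthonormal transform
delta_true = np.array([2.0, 0.5])                      # assumed step sizes
B_x = W_true.T @ np.diag(delta_true)                   # W^{-1} B_y with W^{-1} = W^T

delta_est, W_est = steps_and_transform(B_x)
```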

III. Background on lattice theory

In this section we provide the necessary background on lattice theory. Further details can be found,
e.g., in [28]. Let $\mathcal{L}$ denote a lattice of rank $K$ embedded in $\mathbb{R}^N$. Let $\mathbf{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_K]$ denote
a basis for the lattice $\mathcal{L}$. That is,

$$\mathcal{L} = \{\mathbf{x} \mid \mathbf{x} = a_1\mathbf{b}_1 + a_2\mathbf{b}_2 + \ldots + a_K\mathbf{b}_K, \ a_i \in \mathbb{Z}\}. \tag{7}$$

In order to make the mapping between a basis and the corresponding lattice explicit, the latter can be
expressed as $\mathcal{L}(\mathbf{B})$.

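Any lattice basis also describes a fundamental parallelotope according to

$$\mathcal{P}(\mathbf{B}) = \left\{\mathbf{x} \,\middle|\, \mathbf{x} = \sum_{i=1}^{K} \theta_i \mathbf{b}_i, \ 0 \leq \theta_i < 1\right\}. \tag{8}$$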



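When $K = 2, 3$, $\mathcal{P}(\mathbf{B})$ is, respectively, a parallelogram or a parallelepiped. As an example, Figure 1(a)
shows the fundamental parallelotope corresponding to a lattice basis $\mathbf{B}$ when $K = 2$.

Given a point $\mathbf{z} \in \mathbb{R}^K$, let $\mathcal{P}_z(\mathbf{B})$ denote the parallelotope enclosing $\mathbf{z}$. $\mathcal{P}_z(\mathbf{B})$ is obtained by translating
$\mathcal{P}(\mathbf{B})$ so that its origin coincides with one of the lattice points. More specifically,

$$\mathcal{P}_z(\mathbf{B}) = \left\{\mathbf{x} \,\middle|\, \mathbf{x} = \mathbf{B}\lfloor \mathbf{B}^{-1}\mathbf{z} \rfloor + \sum_{i=1}^{K} \theta_i \mathbf{b}_i, \ 0 \leq \theta_i < 1\right\}. \tag{9}$$

Figure 1(b) illustrates $\mathcal{P}_z(\mathbf{B})$ for an arbitrary vector $\mathbf{z}$.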

Different bases for the same lattice lead to different fundamental parallelotopes. For example,
Figure 1(a) and Figure 1(c) depict two different bases for the same lattice, together with the corresponding
fundamental parallelotopes. However, the volume of $\mathcal{P}(\mathbf{B})$ is the same for all bases of a given lattice.
This volume equals the so-called lattice determinant, which is a lattice invariant defined as

$$|\mathcal{L}| = \sqrt{\det(\mathbf{B}^T\mathbf{B})}. \tag{10}$$

If the lattice is full rank, i.e., $K = N$, the lattice determinant equals the absolute value of the determinant of the matrix $\mathbf{B}$,
$|\mathcal{L}| = |\det(\mathbf{B})|$.
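
The invariance of the lattice determinant in (10) can be illustrated numerically. In the sketch below, the bases and the unimodular matrix `U` are arbitrary illustrative choices, and `lattice_det` is a hypothetical helper name.

```python
import numpy as np

def lattice_det(B):
    """Lattice determinant of eq. (10); valid also when the rank K is below N."""
    return float(np.sqrt(np.linalg.det(B.T @ B)))

B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
U = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # integer matrix with det(U) = 1 (unimodular)
B_equiv = B @ U                       # a different basis for the same lattice

# A rank-2 lattice embedded in R^3, where det(B) itself is undefined.
B_low = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [0.0, 0.0]])

d_full, d_equiv, d_low = lattice_det(B), lattice_det(B_equiv), lattice_det(B_low)
```

Both `d_full` and `d_equiv` equal $|\det(\mathbf{B})| = 6$, while the rank-deficient example shows why (10) is stated in terms of $\mathbf{B}^T\mathbf{B}$.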

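Let $\bar{\mathcal{L}}$ denote a sub-lattice of $\mathcal{L}$. That is, any vector $\mathbf{x} \in \bar{\mathcal{L}}$ also satisfies $\mathbf{x} \in \mathcal{L}$. A basis $\bar{\mathbf{B}}$ for $\bar{\mathcal{L}}$ can be
expressed in terms of $\mathbf{B}$ as

$$\bar{\mathbf{B}} = \mathbf{B}\mathbf{U}, \tag{11}$$

where $\mathbf{U}$ is such that $u_{ij} \in \mathbb{Z}$. Moreover, let $\det(\mathbf{U}) = \pm m$, then

$$\frac{|\bar{\mathcal{L}}|}{|\mathcal{L}|} = |\det(\mathbf{U})| = m \tag{12}$$

and we say that $\bar{\mathcal{L}}$ is a sub-lattice of $\mathcal{L}$ of index $m$. For example, Figure 1(d) shows two lattices $\mathcal{L}$ and
$\bar{\mathcal{L}}$, such that $\bar{\mathcal{L}} \subset \mathcal{L}$. In this case, the matrix $\mathbf{U}$ is equal to

$$\mathbf{U} = \begin{bmatrix} -4 & -5 \\ 3 & -1 \end{bmatrix} \tag{13}$$

and $\bar{\mathcal{L}}$ is a sub-lattice of index $m = 19$.

Fig. 1. Examples of lattices. (a) The fundamental parallelotope of a lattice defined by a basis $\mathbf{B}$. (b) Parallelotope enclosing
an arbitrary vector $\mathbf{z}$. (c) Another (equivalent) basis for the lattice in (a). (d) An example of a sub-lattice of the lattice $\mathcal{L}(\mathbf{B})$.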
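
As a quick check of the index quoted for the matrix in (13), the determinant can be expanded directly:

```latex
\det(\mathbf{U}) = (-4)(-1) - (-5)(3) = 4 + 15 = 19,
\qquad
\frac{|\bar{\mathcal{L}}|}{|\mathcal{L}|} = |\det(\mathbf{U})| = 19 .
```

The fundamental parallelotope of the sub-lattice therefore has 19 times the volume of that of the original lattice, in agreement with (12).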



IV. An algorithm for transform identification 

In this section we propose an algorithm that is able to determine the parameters of a transform coder
from its output, i.e., a set of observed vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$. This is accomplished by finding a suitable
lattice $\mathcal{L}^*$ such that $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\} \subset \mathcal{L}^*$. In Section V-C we will show that, with probability approaching
one, $\mathcal{L}^* = \mathcal{L}_x$, provided that $P - N > 0$.

The problem of determining a basis for the lattice $\mathcal{L}_x$ is complicated by the fact that we typically
observe a finite (and possibly small) number of vectors $P$ embedded in a possibly high-dimensional
space. More precisely, $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$ belong to a bounded lattice, by virtue of the fact that each transform
coefficient $y_i$ is quantized with a finite number of bits $R_i$, to one of $2^{R_i}$ reconstruction levels. Let $R$
denote the average number of bits allocated to transform coefficients. The number of potential lattice
points is equal to

$$\prod_{i=1}^{N} 2^{R_i} = 2^{\sum_{i=1}^{N} R_i} = 2^{NR}, \tag{14}$$

and only $P$ of them are covered by observed vectors. Thus, we note that, given $R$, the number of
lattice points increases exponentially with the dimension $N$ and that in most cases of practical relevance
$P \ll 2^{NR}$.

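Another issue arises from the fact that, for a set of vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$, there are infinitely many
lattices that include all of them. Indeed, any lattice $\hat{\mathcal{L}}$ such that $\mathcal{L}_x \subseteq \hat{\mathcal{L}}$ is compatible with the observed set
of vectors. Note that any basis of the form $\hat{\mathbf{B}} = \mathbf{B}_x\mathbf{U}^{-1}$, where $\mathbf{U}$ is an integer matrix with $\det(\mathbf{U}) = \pm m$ and $m$ an integer greater
than one, defines a compatible lattice $\hat{\mathcal{L}}$. A simple example is obtained setting $\mathbf{U} = \alpha\mathbf{I}$, $\alpha \in \mathbb{N}$, $\alpha > 1$.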

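In order to resolve this ambiguity, we seek the lattice $\mathcal{L}^*$ that maximizes the lattice determinant $|\mathcal{L}|$
within this infinite set of compatible lattices. That is,

$$\begin{aligned} &\underset{\mathcal{L}(\mathbf{B})}{\text{maximize}} && |\mathcal{L}(\mathbf{B})| \\ &\text{subject to} && \{\mathbf{x}_1, \ldots, \mathbf{x}_P\} \subset \mathcal{L}(\mathbf{B}). \end{aligned} \tag{15}$$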
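
The ambiguity that motivates the maximization in (15) can be seen on a toy example: a finer lattice obtained with $\mathbf{U} = 2\mathbf{I}$ still contains every observed point, but has a smaller determinant. The basis, the integer coordinates and the helper `contains_all` below are illustrative assumptions.

```python
import numpy as np

def contains_all(B, X, tol=1e-9):
    """True if every column of X is an integer combination of the columns of B."""
    C = np.linalg.solve(B, X)
    return bool(np.all(np.abs(C - np.round(C)) < tol))

B_x = np.array([[2.0, 1.0],
                [0.0, 1.5]])                  # hypothetical true basis
coords = np.array([[1, -2, 3],
                   [0, 4, 1]])                # integer coordinates of three points
X = B_x @ coords                              # observed vectors
B_fine = B_x / 2.0                            # B_x U^{-1} with U = 2 I

det_true = abs(np.linalg.det(B_x))            # 3.0
det_fine = abs(np.linalg.det(B_fine))         # 0.75
both_feasible = contains_all(B_x, X) and contains_all(B_fine, X)
```

Both bases are feasible for (15), and the objective selects the one with the larger determinant.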



For example, for the set of observed points $\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$ depicted in Figure 2(a), Figure 2(g) illustrates
a basis for the lattice that is the optimal solution of (15). In contrast, the lattice in Figure 2(h) is a feasible
solution of (15), but it is not optimal, since it is characterized by a lower value of the lattice determinant.

The proposed method used to solve the problem above is detailed in Algorithm 1. The method constructs
an initial basis for an N-dimensional lattice (line 1). This is accomplished by considering the vectors
in $\mathcal{O}$ until $N$ linearly independent vectors are found. These vectors are used as columns of the starting
estimate $\mathbf{B}^{(0)}$ and to populate the initial set of visited vectors $\mathcal{S}$. We denote with $\mathcal{U}$ the set of vectors
in $\mathcal{O}$ that have not been visited yet. Then, the solution of (15) is constructed iteratively, by considering
the remaining vectors in $\mathcal{U}$ one by one. At each iteration, the function recurseTI returns a basis for
a lattice that solves (15), in which the constraint is imposed only on the subset of visited vectors $\mathcal{S}$, that
is, $\mathcal{S} \subset \mathcal{L}(\mathbf{B})$. As such, the algorithm starts by finding the solution of an under-constrained problem and
additional constraints are added as more vectors are visited.

ALGORITHM 1: TI algorithm

Input: Set of observed vectors $\mathcal{O} = \{\mathbf{x}_1, \ldots, \mathbf{x}_P\}$

Output: A basis $\mathbf{B}$ of the lattice solution of (15)

1) $\mathbf{B}^{(0)}$ = initBasis($\mathcal{O}$);

2) $\mathcal{S} = \{\mathbf{b}_1, \ldots, \mathbf{b}_N\}$;

3) $\mathcal{U} = \mathcal{O} \setminus \mathcal{S}$;

4) $r = 0$;

5) while card$\{\mathcal{U}\} > 0$

6)     Pick $\mathbf{x}$ in $\mathcal{U}$;

7)     $\mathcal{U} = \mathcal{U} \setminus \{\mathbf{x}\}$;

8)     $\mathcal{S} = \mathcal{S} \cup \{\mathbf{x}\}$;

9)     $\mathbf{B}^{(r+1)}$ = recurseTI($\mathbf{B}^{(r)}$, $\mathcal{S}$);

10)    $r = r + 1$;

11) end

Figure 2 shows an illustrative example when $N = 2$ and three vectors $\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$ are observed
(Figure 2(b)). The initial basis (line 1) is constructed using $\mathbf{x}_1$ and $\mathbf{x}_2$, since they are linearly independent
(Figure 2(b)). Then, the point $\mathbf{x}_3$ is selected (line 6 and Figure 2(c)) and the function recurseTI (line 9)
returns a basis that solves (15), i.e., a basis with the largest lattice determinant that includes all observed
vectors. Figure 2(f) illustrates such a basis, and Figure 2(g) shows an equivalent basis obtained after
lattice reduction.

The core of the method is the recursive function recurseTI. When describing this function, we keep
a clear distinction between algorithm template and algorithm instance, as is customary in computer
science. We start by describing the template in Algorithm 2, which does not specify the function entirely.
Then, a concrete instance of the template is detailed in Algorithm 3. This distinction is maintained
because the correctness of the method is a property that descends from
the template alone, as further discussed in Section V-A. Conversely, the rate of convergence depends on
the specific algorithm instance, as explained in Section V-B.

Fig. 2. An example of transform identification. A set of three observed vectors is given in (a). Then, (b)-(h) show, step-by-step,
how the solution to problem (15) is sought by Algorithm 1.



A. An algorithm template for recurseTI

The function recurseTI receives as input a set of visited vectors $\mathcal{S}$ and the current estimate of a
basis $\mathbf{B}$ for the lattice $\mathcal{L}(\mathbf{B})$. If $\mathcal{S} \subset \mathcal{L}(\mathbf{B})$, i.e., all the vectors in $\mathcal{S}$ belong to the lattice defined by $\mathbf{B}$, the
recursion is terminated (line 1 in Algorithm 2). Otherwise, one of the vectors $\mathbf{z}$ that does not belong to
$\mathcal{L}(\mathbf{B})$ is selected (line 4) and the parallelotope which encloses it is identified (line 5). Then, a vector $\mathbf{d}$ is
computed as the difference between $\mathbf{z}$ and one of the vertices of the parallelotope (lines 6 and 7). The intuition
here is to capture a short vector that cannot be represented by the current lattice, and to modify the
current basis in such a way that (upon convergence) it can be represented. Hence, the updated basis is
constructed by replacing one of the columns of $\mathbf{B}$ with $\mathbf{d}$ (line 8). Among the $N$ possible cases, any
choice such that $\mathbf{B}_l$ is non-singular represents a valid selection (line 9).

ALGORITHM 2: recurseTI($\mathbf{B}$, $\mathcal{S}$)

Input: Set of vectors $\mathcal{S} = \{\mathbf{x}_1, \ldots, \mathbf{x}_{|\mathcal{S}|}\}$, a basis $\mathbf{B}$ of a lattice.

Output: A basis of a lattice $\mathcal{L}$ with maximum determinant $|\mathcal{L}|$, such that $\mathcal{S} \subset \mathcal{L}$

1) if $\mathcal{S} \subset \mathcal{L}(\mathbf{B})$

2)     return $\mathbf{B}$

3) else

4)     Pick $\mathbf{z} \in \mathcal{S} \setminus \mathcal{L}(\mathbf{B})$.

5)     Determine $\mathcal{P}_z(\mathbf{B})$.

6)     Pick a vertex $\mathbf{v}$ of $\mathcal{P}_z(\mathbf{B})$.

7)     Compute $\mathbf{d} = \mathbf{z} - \mathbf{v}$.

8)     Compute $\mathbf{B}_i$, $i = 1, \ldots, N$, replacing the $i$-th column of $\mathbf{B}$ with $\mathbf{d}$.

9)     Pick an index $l$, such that $\det(\mathbf{B}_l) \neq 0$.

10)    return recurseTI($\mathbf{B}_l$, $\mathcal{S}$);

11) end

In the example in Figure 2, two recursive steps are performed before terminating recurseTI. In the
first call, it is verified that $\mathbf{x}_3$ does not belong to the lattice defined by the current basis (Figure 2(c)),
and the updated basis is constructed (Figure 2(d)) by replacing one of the two basis vectors with the
difference vector between $\mathbf{x}_3$ and one of the vertices of $\mathcal{P}_{x_3}(\mathbf{B})$. In the second call it is verified that
neither $\mathbf{x}_3$ nor $\mathbf{x}_2$ belongs to the updated lattice. Therefore, one of the two difference vectors (e.g., the
one representing the difference between $\mathbf{x}_2$ and one of the vertices of $\mathcal{P}_{x_2}(\mathbf{B})$) is used to replace one of
the two basis vectors. In the third call the recursion is terminated, because all points in $\mathcal{S}$ belong to the
lattice.

In Section V-A it is shown that the recursion always terminates in a finite number of steps and leads
to the optimal solution of (15). The solution the algorithm converges to, though, might be a sub-lattice of
the underlying lattice $\mathcal{L}_x$, i.e., $\mathcal{L}^* \subset \mathcal{L}_x$. Fortunately, this is a very unlikely event, even when the number
of observed points $P$ is only slightly larger than $N$, as further discussed in Section V-C.

B. An algorithm instance for recurseTI

A practical instantiation of the template presented in Algorithm 2 requires specifying how to perform
the choices at lines 4, 6 and 9, which were left undefined. Note that these choices are arbitrary and have
no effect on the correctness of the method, although they might affect the number of recursive steps
needed to achieve convergence.

In our specific implementation, the selection of the vector $\mathbf{z} \in \mathcal{S} \setminus \mathcal{L}(\mathbf{B})$ (line 4 in Algorithm 2), the
vertex of the parallelotope (line 6) and the column to be replaced (line 9) are carried out as detailed
in Algorithm 3. The rationale is to construct a new basis related to a lattice with the smallest lattice
determinant $|\mathcal{L}(\mathbf{B})|$, so as to tighten the upper bound on the value of the optimal solution, i.e., $|\mathcal{L}^*| \leq
|\mathcal{L}(\mathbf{B})|$.

Specifically, given a basis $\mathbf{B}$ as input, we compute the vector $\hat{\mathbf{x}} = \mathbf{B} \cdot \mathrm{round}(\mathbf{B}^{-1}\mathbf{x})$, which represents
one of the vertices of the parallelotope enclosing $\mathbf{x}$ (line 4 in Algorithm 3). In order to prevent numerical
instability induced by the inversion of the matrix $\mathbf{B}$, we perform basis reduction according to the LLL
algorithm (line 2) and we find a nearly orthogonal basis which is equivalent to $\mathbf{B}$, but has a smaller
orthogonality defect. In our implementation, we perform basis reduction only when the condition number
of $\mathbf{B}$ is greater than a threshold $T$, which was set equal to $10^4$ (line 1).

Then, the selected point $\mathbf{z} = \mathbf{x}_{j^*}$ is the one that minimizes the distance from the corresponding vertex
(line 8). That is,

$$j^* = \arg\min_{j \in \{j \,\mid\, \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2 > 0\}} \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2, \tag{16}$$

so as to minimize the length of the new basis vector $\mathbf{d}$. Similarly, the choice of the new basis among the set
of (up to) $N$ candidate bases $\mathbf{B}_i$ (line 11) is to select the one that leads to the smallest lattice determinant,
after excluding those that do not have rank $N$. From Cramer's rule, it follows that $\det(\mathbf{B}_i) = \theta_i \det(\mathbf{B})$,
where $\boldsymbol{\theta} = \mathbf{B}^{-1}\mathbf{d}$ is the expansion of $\mathbf{d}$ in the basis $\mathbf{B}$. Hence, we replace the $l$-th column of $\mathbf{B}$, which
is the one corresponding to the entry of $\boldsymbol{\theta}$ with the least strictly positive absolute value. That is,

$$l = \arg\min_{i \in \{i \,\mid\, \theta_i \neq 0\}} |\theta_i|. \tag{17}$$
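
The outer loop of Algorithm 1 and the instance of recurseTI in Algorithm 3 can be sketched compactly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the recursion is unrolled into a loop, the LLL reduction step is omitted (so it is only suitable for small, well-conditioned examples), and the function names and tolerances are hypothetical.

```python
import numpy as np

def init_basis(O, tol=1e-9):
    """Scan the columns of O until N linearly independent ones are found."""
    N, cols = O.shape[0], []
    for k in range(O.shape[1]):
        if np.linalg.matrix_rank(O[:, cols + [k]], tol=tol) == len(cols) + 1:
            cols.append(k)
        if len(cols) == N:
            return O[:, cols].copy(), cols
    raise ValueError("fewer than N linearly independent observations")

def recurse_ti(B, S, tol=1e-6):
    """Loop form of Algorithm 3 (no LLL step): shrink L(B) until it contains S."""
    while True:
        C = np.linalg.solve(B, S)                 # coordinates of visited vectors
        R = S - B @ np.round(C)                   # x_j - x_hat_j for every j
        dist = np.linalg.norm(R, axis=0)
        outside = np.flatnonzero(dist > tol)
        if outside.size == 0:
            return B                              # all visited vectors lie in L(B)
        j = outside[np.argmin(dist[outside])]     # eq. (16)
        theta = C[:, j] - np.round(C[:, j])       # expansion of d in the basis B
        nonzero = np.flatnonzero(np.abs(theta) > tol)
        l = nonzero[np.argmin(np.abs(theta[nonzero]))]   # eq. (17)
        B = B.copy()
        B[:, l] = R[:, j]                         # replace the l-th column with d

def identify(O):
    """Algorithm 1: visit the observations one by one, updating the basis."""
    B, visited = init_basis(O)
    visited = list(visited)
    for k in range(O.shape[1]):
        if k not in visited:
            visited.append(k)
            B = recurse_ti(B, O[:, visited])
    return B

# Toy example: the observations include the columns of B_x itself, so the
# maximum-determinant lattice containing them is exactly L(B_x).
B_x = np.array([[2.0, 1.0], [0.0, 3.0]])
coords = np.array([[2, 1, 1, 0, 3], [1, 3, 0, 1, -2]])
X = B_x @ coords
B_est = identify(X)
```

In this example the initial basis generates a sub-lattice of index 5, and a single column replacement brings the determinant down to $|\det(\mathbf{B}_x)| = 6$. The returned basis generally differs from $\mathbf{B}_x$ but generates the same lattice.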
V. Analysis 

A. Convergence 

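In this section, we prove that the proposed algorithm converges in a finite number of recursive steps to
the solution $\mathcal{L}^*$ of (15). To this end, we rely on the specifications of the algorithm template in Algorithm 2.

ALGORITHM 3: recurseTI($\mathbf{B}$, $\mathcal{S}$)

Input: Set of vectors $\mathcal{S} = \{\mathbf{x}_1, \ldots, \mathbf{x}_{|\mathcal{S}|}\}$, a basis $\mathbf{B}$ of a lattice.

Output: A basis of a lattice $\mathcal{L}$ with maximum determinant $|\mathcal{L}|$, such that $\mathcal{S} \subset \mathcal{L}$

1) if condnum($\mathbf{B}$) > $T$

2)     $\mathbf{B}$ = lll($\mathbf{B}$)

3) end

4) $\hat{\mathbf{x}}_i = \mathbf{B} \cdot \mathrm{round}(\mathbf{B}^{-1}\mathbf{x}_i)$, $i = 1, \ldots, |\mathcal{S}|$;

5) if $\max_{j = 1, \ldots, |\mathcal{S}|} \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2 = 0$

6)     return $\mathbf{B}$

7) else

8)     $j^* = \arg\min_{j \in \{j \,\mid\, \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2 > 0\}} \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2$;

9)     $\mathbf{d} = \mathbf{x}_{j^*} - \hat{\mathbf{x}}_{j^*}$;

10)    $\boldsymbol{\theta} = \mathbf{B}^{-1}\mathbf{d}$;

11)    $l = \arg\min_{i \in \{i \,\mid\, \theta_i \neq 0\}} |\theta_i|$;

12)    return recurseTI($\mathbf{B}_l$, $\mathcal{S}$);

13) end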

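Let $\mathbf{B}^{(0)}$ denote the initial estimate of a basis of the lattice, which is constructed, for example, by
selecting as its columns a subset of $N$ linearly independent vectors in $\mathcal{O}$ (Algorithm 1, line 1). Hence,
each vector of the initial basis $\mathbf{B}^{(0)}$ can be expressed as a linear combination with integer coefficients of
the columns of $\mathbf{B}_x$. Thus, we can write $\mathbf{B}^{(0)} = \mathbf{B}_x\mathbf{A}$, where $\mathbf{A}$ is an integer matrix with $\det(\mathbf{A}) = m$ and $m \in \mathbb{Z} \setminus \{0\}$. From this, it
follows that $|\mathcal{L}(\mathbf{B}^{(0)})| = |m| \cdot |\mathcal{L}_x|$ and $|\mathcal{L}_x| \leq |\mathcal{L}(\mathbf{B}^{(0)})|$.

Let $\mathbf{B}^{(r)}$ denote the estimate obtained after the $r$-th call of the recursive function recurseTI. It is
possible to prove the following lemma:

Lemma 5.1: $|\mathcal{L}(\mathbf{B}^{(r+1)})| \leq |\mathcal{L}(\mathbf{B}^{(r)})|$, with equality if and only if $\mathcal{S} \subset \mathcal{L}(\mathbf{B}^{(r)}) = \mathcal{L}(\mathbf{B}^{(r+1)})$.

Proof: If $\mathcal{S} \subset \mathcal{L}(\mathbf{B}^{(r)})$, then $\mathbf{B}^{(r+1)} = \mathbf{B}^{(r)}$ and the recursion terminates. Otherwise, let $\mathbf{z} \in
\mathcal{S} \setminus \mathcal{L}(\mathbf{B}^{(r)})$ be any of the points which does not belong to the lattice defined by $\mathbf{B}^{(r)}$, $\mathbf{v}$ any of the
vertices of $\mathcal{P}_z(\mathbf{B}^{(r)})$ and $\mathbf{d} = \mathbf{z} - \mathbf{v}$. The vector $\mathbf{d}$ can be expressed in terms of the basis $\mathbf{B}^{(r)}$ as

$$\mathbf{d} = \mathbf{B}^{(r)}\boldsymbol{\theta}. \tag{18}$$

By definition, the vector $\mathbf{z}$ belongs to $\mathcal{P}_z(\mathbf{B}^{(r)})$, hence $-1 < \theta_i < 1$. Since $\mathbf{z} \notin \mathcal{L}(\mathbf{B}^{(r)})$, $\mathbf{z}$ does not
belong to the vertices of $\mathcal{P}_z(\mathbf{B}^{(r)})$. It follows that there is at least one coefficient $\theta_i$ in the basis expansion
of $\mathbf{d}$, such that $0 < |\theta_i| < 1$.

The vector $\mathbf{d}$ replaces the $i$-th column of $\mathbf{B}^{(r)}$ to obtain $\mathbf{B}_i^{(r)}$. From Cramer's rule,

$$\det(\mathbf{B}_i^{(r)}) = \theta_i \det(\mathbf{B}^{(r)}). \tag{19}$$

Therefore, if we select $l$, such that $0 < |\theta_l| < 1$,

$$|\mathcal{L}(\mathbf{B}^{(r+1)})| = |\det(\mathbf{B}^{(r+1)})| = |\det(\mathbf{B}_l^{(r)})| = |\theta_l||\det(\mathbf{B}^{(r)})| < |\det(\mathbf{B}^{(r)})| = |\mathcal{L}(\mathbf{B}^{(r)})|. \tag{20}$$

Note that there must be at least one such index $l$, as indicated above.

■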

We construct the sequence of lattice determinants

$$s_r = |\mathcal{L}(\mathbf{B}^{(r)})|, \quad r = 0, 1, \ldots, R. \tag{21}$$

Let $R$ denote the smallest integer such that $|\mathcal{L}(\mathbf{B}^{(R)})| = |\mathcal{L}(\mathbf{B}^{(R+1)})|$. That is, $R$ is the number of steps
needed to achieve convergence. It is possible to prove the following theorem:

Theorem 5.2: Algorithm 1 converges to the solution of (15).



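Proof: Let $\mathcal{L}^*$ denote the solution of (15), i.e., the lattice with maximum volume that includes all
observed vectors $\mathcal{S}$. We need to prove that $\mathcal{L}(\mathbf{B}^{(R)}) = \mathcal{L}^*$.

First, we prove that $|\mathcal{L}(\mathbf{B}^{(R)})|$ cannot decrease beyond $|\mathcal{L}^*|$, i.e., $|\mathcal{L}^*| \leq |\mathcal{L}(\mathbf{B}^{(R)})|$. To this end, let
$\mathcal{L}(\mathbf{B}^{(R-1)})$ denote the lattice obtained at the iteration just before convergence. Hence, there is at least one
observed vector $\mathbf{x} \in \mathcal{L}^*$ such that $\mathbf{x} \notin \mathcal{L}(\mathbf{B}^{(R-1)})$. Lemma 5.1 establishes that $|\mathcal{L}(\mathbf{B}^{(R)})| < |\mathcal{L}(\mathbf{B}^{(R-1)})|$.

Let $\mathbf{d}$ denote the difference vector as in line 7 of Algorithm 2. By construction, $\mathbf{d} \in \mathcal{L}^*$. Let $\mathbf{B}^*$ denote
a basis for $\mathcal{L}^*$. Then, it is possible to write $\mathbf{d} = \mathbf{B}^*\boldsymbol{\theta}^*$, $\boldsymbol{\theta}^* \in \mathbb{Z}^N$. $\mathcal{L}(\mathbf{B}^{(R-1)})$ is a sublattice of $\mathcal{L}^*$. Hence,
$\mathbf{B}^{(R-1)} = \mathbf{B}^*\mathbf{A}$, where $\mathbf{A}$ is a matrix of integer elements such that $\det(\mathbf{A}) = m$, with $m \in \mathbb{Z} \setminus \{0\}$, and
$|\mathcal{L}(\mathbf{B}^{(R-1)})| / |\mathcal{L}^*| = |m|$.

It is possible to express $\mathbf{d}$ in the basis expansion of $\mathbf{B}^{(R-1)}$. That is,

$$\boldsymbol{\theta} = (\mathbf{B}^{(R-1)})^{-1}\mathbf{d} = (\mathbf{B}^*\mathbf{A})^{-1}\mathbf{B}^*\boldsymbol{\theta}^* = \mathbf{A}^{-1}\boldsymbol{\theta}^* = \frac{1}{\det(\mathbf{A})}\,\mathrm{adj}(\mathbf{A})\,\boldsymbol{\theta}^*. \tag{22}$$

Note that both the adjugate (transposed cofactor) matrix $\mathrm{adj}(\mathbf{A})$ and $\boldsymbol{\theta}^*$ have integer elements. Hence, the vector $\mathrm{adj}(\mathbf{A})\boldsymbol{\theta}^*$
has integer elements. Any nonzero element of $\boldsymbol{\theta}$ is an integer multiple of $1/\det(\mathbf{A}) = 1/m$. Therefore,
if $\theta_i \neq 0$, $|\theta_i| \geq 1/|m|$.

From the proof of Lemma 5.1 we know that

$$|\mathcal{L}(\mathbf{B}^{(R)})| = |\theta_l|\,|\mathcal{L}(\mathbf{B}^{(R-1)})| \geq \frac{1}{|m|}|\mathcal{L}(\mathbf{B}^{(R-1)})| = |\mathcal{L}^*|, \tag{23}$$

where $\theta_l$ is one of the nonzero elements of $\boldsymbol{\theta}$.

To prove that $|\mathcal{L}(\mathbf{B}^{(R)})| = |\mathcal{L}^*|$, it remains to be shown that $|\mathcal{L}(\mathbf{B}^{(R)})| > |\mathcal{L}^*|$ is impossible. Indeed, if
this were the case, $\mathcal{L}(\mathbf{B}^{(R)})$ would be the optimal solution of (15), since it includes all observed points
$\mathcal{S}$ and has volume larger than $|\mathcal{L}^*|$. ■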

Note that $R < \infty$, i.e., convergence is achieved in a finite number of steps. Indeed, every lattice
$\mathcal{L}(\mathbf{B}^{(r)})$ is a sub-lattice of $\mathcal{L}_x$, so $\{s_r / |\mathcal{L}_x|\}$ is a sequence
of positive integers. The sequence is strictly decreasing due to Lemma 5.1 until convergence is
achieved and $\mathcal{S} \subset \mathcal{L}(\mathbf{B}^{(R)})$. In addition, $s_r$ is bounded from below by $|\mathcal{L}_x|$. Therefore, convergence is
achieved in up to $|\mathcal{L}(\mathbf{B}^{(0)})| / |\mathcal{L}_x|$ steps. In the following section we show that with a specific
instantiation of Algorithm 2, given in Algorithm 3, it is possible to ensure a significantly faster convergence
rate.

B. Rate of convergence 

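It is possible to prove that the proposed method implemented according to the instance presented in
Algorithm 3 converges in a number of steps that is upper bounded by $\lceil \log_2(|\mathcal{L}(\mathbf{B}^{(0)})| / |\mathcal{L}_x|) \rceil$. To show
this, it suffices to demonstrate that the value of the lattice determinant is (at least) halved between two
consecutive calls of recurseTI, as stated by the following theorem.

Theorem 5.3: If $\mathcal{S} \not\subset \mathcal{L}(\mathbf{B}^{(r)})$, then $\dfrac{|\mathcal{L}(\mathbf{B}^{(r+1)})|}{|\mathcal{L}(\mathbf{B}^{(r)})|} \leq \dfrac{1}{2}$.

Proof: Since $\mathcal{S} \not\subset \mathcal{L}(\mathbf{B}^{(r)})$, then $\max_{j = 1, \ldots, |\mathcal{S}|} \|\mathbf{x}_j - \hat{\mathbf{x}}_j\|_2 > 0$, and the recursion is not terminated.
Consider the vector $\mathbf{d} = \mathbf{x}_{j^*} - \hat{\mathbf{x}}_{j^*}$, which can be expressed in the basis $\mathbf{B}^{(r)}$ as $\mathbf{d} = \mathbf{B}^{(r)}\boldsymbol{\theta}$. Dropping the
superscript $(r)$, it is possible to write

$$\begin{aligned} \boldsymbol{\theta} &= \mathbf{B}^{-1}\mathbf{d} = \mathbf{B}^{-1}(\mathbf{x}_{j^*} - \hat{\mathbf{x}}_{j^*}) && (24) \\ &= \mathbf{B}^{-1}\mathbf{x}_{j^*} - \mathbf{B}^{-1}\mathbf{B} \cdot \mathrm{round}(\mathbf{B}^{-1}\mathbf{x}_{j^*}) && (25) \\ &= \mathbf{B}^{-1}\mathbf{x}_{j^*} - \mathrm{round}(\mathbf{B}^{-1}\mathbf{x}_{j^*}) = \mathbf{a} - \mathrm{round}(\mathbf{a}), && (26) \end{aligned}$$

where we set $\mathbf{a} = \mathbf{B}^{-1}\mathbf{x}_{j^*}$. Due to the properties of rounding, $-1/2 \leq \theta_i \leq 1/2$. Thus, replacing any of
the columns of $\mathbf{B}^{(r)}$ such that $\theta_l \neq 0$, we obtain, using Cramer's rule,

$$\frac{|\mathcal{L}(\mathbf{B}^{(r+1)})|}{|\mathcal{L}(\mathbf{B}^{(r)})|} = |\theta_l| \leq \frac{1}{2}. \tag{27}$$

■

Based on Theorem 5.3,

$$|\mathcal{L}(\mathbf{B}^{(r)})| \leq \frac{1}{2^r}|\mathcal{L}(\mathbf{B}^{(0)})|, \quad \forall r > 0, \ \mathcal{S} \not\subset \mathcal{L}(\mathbf{B}^{(r)}). \tag{28}$$

Hence, convergence is achieved in up to

$$\left\lceil \log_2 \frac{|\mathcal{L}(\mathbf{B}^{(0)})|}{|\mathcal{L}_x|} \right\rceil \tag{29}$$

steps.

Note that this upper bound on the convergence rate is guaranteed solely on the basis of the way the
vertex of the parallelotope is selected, whereas it depends neither on which point is selected nor
on which column is replaced. However, the heuristics applied in Algorithm 3 are based on the rationale
of reducing the ratio $|\mathcal{L}(\mathbf{B}^{(r+1)})| / |\mathcal{L}(\mathbf{B}^{(r)})|$ as much as possible.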
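
The halving property can be observed on a single update step. The sketch below uses a hypothetical 2-dimensional basis and point; it computes $\boldsymbol{\theta} = \mathbf{a} - \mathrm{round}(\mathbf{a})$, replaces the column selected by (17), and compares the determinant ratio with $|\theta_l|$.

```python
import math
import numpy as np

B = np.array([[5.0, 5.0],
              [3.0, 9.0]])            # current basis, |det(B)| = 30 (illustrative)
x = np.array([2.0, 0.0])              # a visited vector outside L(B) (illustrative)

a = np.linalg.solve(B, x)             # a = B^{-1} x = (0.6, -0.2)
theta = a - np.round(a)               # each entry lies in [-1/2, 1/2]
nonzero = np.flatnonzero(np.abs(theta) > 1e-12)
l = nonzero[np.argmin(np.abs(theta[nonzero]))]   # eq. (17)

B_new = B.copy()
B_new[:, l] = B @ theta               # d = x - B round(B^{-1} x)
ratio = abs(np.linalg.det(B_new)) / abs(np.linalg.det(B))

# Worst-case number of steps from (29) if the true lattice had determinant 6.
max_steps = math.ceil(math.log2(30 / 6))
```

Here the ratio equals $|\theta_l| = 0.2$, well below the guaranteed bound of $1/2$, which is why far fewer steps than `max_steps` are typically needed.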



C. Probability of success 



In Section V-A we showed that the proposed method converges to the optimal solution $\mathcal{L}^*$ of (15). In
this section, we show that it converges to the correct (and unique) lattice $\mathcal{L}_x$ (i.e., $\mathcal{L}^* = \mathcal{L}_x$) with high
probability, provided that the number of observed vectors $P$ is greater than $N$.

Given a lattice $\mathcal{L}_x$ of rank $N$ embedded in $\mathbb{R}^N$, there is more than one sub-lattice $\bar{\mathcal{L}}$ of $\mathcal{L}_x$ of index $m$.
It can be shown that the number of sub-lattices is equal to [33]

$$f_N(m) = \prod_{i=1}^{q} \prod_{j=1}^{N-1} \frac{p_i^{t_i + j} - 1}{p_i^{j} - 1} = \prod_{i=1}^{q} \prod_{j=1}^{t_i} \frac{p_i^{N+j-1} - 1}{p_i^{j} - 1}, \tag{30}$$

where $m = p_1^{t_1} \cdots p_q^{t_q}$ is the prime factorization of $m$. That is, $p_1, \ldots, p_q$ are the prime factors of $m$, and
$t_s$ is the multiplicity of the factor $p_s$.

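For example, when N = 2 and m = 2, $f_2(2) = 3$. Given the basis $\mathbf{B} = \mathbf{I}$, the corresponding sub-lattices of $\mathcal{L}(\mathbf{B})$ are generated by, e.g., the following bases

$$\mathbf{B}_1 = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}, \quad \mathbf{B}_2 = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad \mathbf{B}_3 = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}. \tag{31}$$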
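The sub-lattice count in (30) is easy to evaluate numerically. The sketch below implements the product over prime factors and, as an independent cross-check, counts Hermite normal forms of determinant m (a standard alternative way of enumerating index-m sub-lattices of $\mathbb{Z}^N$, swapped in here only for verification). The function names are hypothetical and not part of the original method.

```python
from fractions import Fraction


def prime_factorization(m):
    """Return {prime: multiplicity} for an integer m >= 1 by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def num_sublattices(N, m):
    """Evaluate (30): product over primes p_i and j = 1..t_i of
    (p_i^(N+j-1) - 1) / (p_i^j - 1)."""
    result = Fraction(1)
    for p, t in prime_factorization(m).items():
        for j in range(1, t + 1):
            result *= Fraction(p ** (N + j - 1) - 1, p ** j - 1)
    assert result.denominator == 1  # the full product is always an integer
    return int(result)


def count_hnf(N, m):
    """Count N x N integer Hermite normal forms with determinant m.

    Each ordered factorization m = d_1 * ... * d_N contributes
    prod_i d_i^(i-1) matrices (the free off-diagonal entries).
    """
    if N == 0:
        return 1 if m == 1 else 0
    return sum(d ** (N - 1) * count_hnf(N - 1, m // d)
               for d in range(1, m + 1) if m % d == 0)


if __name__ == "__main__":
    print(num_sublattices(2, 2))  # the N = 2, m = 2 example from the text
    for N in (2, 3, 4):
        print(N, [num_sublattices(N, m) for m in range(1, 9)])
```

For N = 2 the count reduces to the sum of the divisors of m, which gives a quick way to spot-check individual values by hand.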



In order to determine analytically a lower bound on the probability of converging to the correct solution, 
we need to prove the following lemma, which provides bounds on the number of sub-lattices. 

Lemma 5.4: Given a lattice $\mathcal{L}_x$ of rank N embedded in $\mathbb{R}^N$, the number $f_N(m)$ of sub-lattices of index m is bounded by

$$m^{N-1} \le f_N(m) \le m^{N}. \tag{32}$$



Proof: It is possible to derive both an upper and a lower bound on the number of sub-lattices that are independent of the prime factorization of m, starting from (30). Since N > 1 in all cases of interest, we have:

$$\frac{p_i^{N+j-1} - 1}{p_i^{j} - 1} \ge \frac{p_i^{N+j-1}}{p_i^{j}}. \tag{33}$$

Substituting in (30), we obtain a function $\underline{f}_N(m)$ that is guaranteed to yield values that do not exceed $f_N(m)$:

$$\underline{f}_N(m) = \prod_{i=1}^{q} \prod_{j=1}^{t_i} \frac{p_i^{N+j-1}}{p_i^{j}}. \tag{34}$$

This can be simplified to:

$$\underline{f}_N(m) = \prod_{i=1}^{q} p_i^{t_i (N-1)}. \tag{35}$$

This is equivalent to the $(N-1)$-th power of the product of the prime factors of m, counted with their multiplicities. That is, the lower bound of $f_N(m)$ can be expressed as:

$$\underline{f}_N(m) = m^{N-1}. \tag{36}$$

In terms of the upper bound of $f_N(m)$, we proceed similarly by starting with the observation that:

$$\frac{p_i^{N+j-1} - 1}{p_i^{j} - 1} \le \frac{p_i^{N+j}}{p_i^{j}}. \tag{37}$$

By substituting back into (30), we can observe that:

$$\prod_{i=1}^{q} \prod_{j=1}^{t_i} \frac{p_i^{N+j}}{p_i^{j}} = m^{N}. \tag{38}$$

Hence, it is easy to see that the upper bound on $f_N(m)$ can be expressed as:

$$\overline{f}_N(m) = m^{N}. \tag{39}$$

Therefore, since $\underline{f}_N(m) \le f_N(m) \le \overline{f}_N(m)$, we have:

$$m^{N-1} \le f_N(m) \le m^{N}. \tag{40}$$

■
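As a quick numerical sanity check of Lemma 5.4, consider the following worked instance of (30). The values N = 3 and m = 12 are chosen here purely for illustration and do not appear in the original analysis.

```latex
\begin{align*}
m &= 12 = 2^{2} \cdot 3, \qquad N = 3, \\
f_3(12) &= \frac{2^{3}-1}{2^{1}-1} \cdot \frac{2^{4}-1}{2^{2}-1} \cdot \frac{3^{3}-1}{3^{1}-1}
         = 7 \cdot 5 \cdot 13 = 455, \\
m^{N-1} &= 144 \;\le\; 455 \;\le\; 1728 = m^{N}.
\end{align*}
```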

Now, consider a specific sub-lattice $\mathcal{L} \subset \mathcal{L}_x$ of index m and a set of P vectors from the original lattice $\mathcal{L}_x$. In the case of uniformly distributed vectors, the probability that one vector belongs to the sub-lattice $\mathcal{L}$ is equal to $1/m$. Thus, the probability that all P vectors belong to the same sub-lattice $\mathcal{L}$ is equal to $(1/m)^P$, assuming statistical independence among the set of vectors.
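The $1/m$ membership probability can be illustrated by direct enumeration. The sketch below assumes the integer lattice $\mathbb{Z}^2$, three illustrative bases of its index-2 sub-lattices, and points drawn uniformly from a 4 × 4 box that covers whole periods of each sub-lattice. The bases, the box size and the function name `in_lattice` are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product


def in_lattice(B, x):
    """True if the integer vector x is an integer combination of the
    columns of the 2 x 2 integer basis B (given as two rows)."""
    (a, b), (c, d) = B
    det = a * d - b * c
    # Cramer's rule: the coefficients are these numerators divided by det.
    return (d * x[0] - b * x[1]) % det == 0 and (-c * x[0] + a * x[1]) % det == 0


# Illustrative bases of the three index-2 sub-lattices of Z^2.
BASES = {
    "first coordinate even": ((2, 0), (0, 1)),
    "second coordinate even": ((1, 0), (0, 2)),
    "coordinate sum even": ((1, -1), (1, 1)),
}

box = list(product(range(4), repeat=2))  # 16 equally likely lattice points
share = {name: Fraction(sum(in_lattice(B, x) for x in box), len(box))
         for name, B in BASES.items()}

P = 5  # number of independent observations
all_in_one = {name: s ** P for name, s in share.items()}

for name in BASES:
    print(name, "single:", share[name], "all", P, "vectors:", all_in_one[name])
```

Each sub-lattice captures exactly half of the points, so the chance that all P independent observations fall in one given index-2 sub-lattice is $(1/2)^P$.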

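Let $p_{\mathrm{fail}}(N, P)$ denote the probability of failing to detect the underlying lattice $\mathcal{L}_x$ of rank N when P points are observed. Then, $p_{\mathrm{succ}}(N, P) = 1 - p_{\mathrm{fail}}(N, P)$. A failure occurs whenever all P vectors fall in any of the sub-lattices of index $m \ge 2$. Hence, we can write

$$p_{\mathrm{fail}}(N, P) \le \sum_{m=2}^{\infty} f_N(m) \left(\frac{1}{m}\right)^{P} \le \sum_{m=2}^{\infty} m^{N} \left(\frac{1}{m}\right)^{P} = \sum_{m=2}^{\infty} \frac{1}{m^{P-N}} = \zeta(P - N) - 1. \tag{41}$$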



The first inequality is a union bound, i.e., the probability of failure is upper bounded by the sum of the probabilities of observing all P vectors in a given sub-lattice. The second inequality follows from the upper bound given by Lemma 5.4. The last expression contains $\zeta(\cdot)$, the Riemann zeta function. That is,

$$\zeta(s) = \sum_{m=1}^{\infty} \frac{1}{m^{s}}. \tag{42}$$

Note that the infinite series converges when the real part of the argument s is greater than 1. In our case, this requires $P - N > 1$, i.e., $P > N + 1$. Then, the probability of success is lower bounded by

$$p_{\mathrm{succ}}(N, P) \ge 2 - \zeta(P - N). \tag{43}$$

It is interesting to observe that the probabilities of failure and success depend solely on the difference $P - N$. Hence, the number P of observed vectors needed to correctly identify the underlying lattice grows linearly with the dimensionality N of the embedding vector space, even though the number of potential lattice points grows exponentially with N, as indicated in Section IV.

Figure 3 shows that the upper bound on the probability of failure rapidly decreases to zero even for modest values of $P - N$.
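The bounds (41) and (43) can be tabulated with a few lines of code. The sketch below approximates $\zeta(s) - 1$ by a finite partial sum plus an integral estimate of the remainder; this numerical scheme, the number of terms and the function names are choices made for this example, not part of the original analysis.

```python
import math


def zeta_minus_one(s, terms=10_000):
    """Approximate zeta(s) - 1 = sum_{m >= 2} m^(-s) for real s > 1."""
    if s <= 1:
        raise ValueError("the series diverges for s <= 1")
    head = sum(m ** -s for m in range(2, terms + 1))
    # Midpoint-rule estimate of the remaining terms m > terms.
    tail = (terms + 0.5) ** (1 - s) / (s - 1)
    return head + tail


def failure_bound(excess):
    """Upper bound (41) on the failure probability, with excess = P - N."""
    return min(1.0, zeta_minus_one(excess))


def success_bound(excess):
    """Lower bound (43) on the success probability, with excess = P - N."""
    return 1.0 - failure_bound(excess)


if __name__ == "__main__":
    for excess in range(2, 11):
        print(f"P - N = {excess:2d}  p_fail <= {failure_bound(excess):.6f}  "
              f"p_succ >= {success_bound(excess):.6f}")
```

For $P - N = 2$ the bound equals $\pi^2/6 - 1 \approx 0.645$, and for large $P - N$ it is dominated by the $m = 2$ term $2^{-(P-N)}$, which is the exponential decay mentioned in the text.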



VI. Experiments 

Section V provided a lower bound on the probability of successfully identifying the transform and the
quantization step sizes. In this section, this aspect is evaluated experimentally. In addition, we provide 
further insight on the complexity of the algorithm, expressed in terms of the number of recursive steps 
needed to compute the sought solution. 

Fig. 3. Upper bound on the probability of failure $p_{\mathrm{fail}}(N, P)$, plotted against $P - N$.

Fig. 4. (a) Empirical probability of success of Algorithm 1 in identifying the transform and the quantization step sizes as a function of the number of observed vectors P and the dimensionality of the embedding vector space N. (b) Number of observed vectors P needed to achieve $p_{\mathrm{succ}}(N, P) > 1 - \epsilon$, with $\epsilon = 10^{-15}$.

To this end, we generated data sets of N-dimensional vectors, whose elements are sampled from a Gaussian random variable $\mathcal{N}(0, \sigma^2)$. We considered the adverse case in which the elements are independent and identically distributed. Therefore, the distribution of the vectors is isotropic and no clue could be obtained from a statistical analysis of the distribution. Without loss of generality, we set $\sigma = 2$, $\mathbf{W} = \mathbf{I}$ and $\Delta_i = 1$, $i = 1, \ldots, N$. The same results were obtained using different transform matrices and quantization step sizes.
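The data generation described above can be sketched as follows. This is a minimal illustration that assumes $\mathbf{W} = \mathbf{I}$, so that decoding reduces to rounding each coefficient to the nearest multiple of its step size; the function names, the default seed and the use of Python's `random` module are assumptions of this example rather than details of the original experiments.

```python
import random


def decode_transform_coded(x, step_sizes):
    """Quantize each coefficient with its own step size and reconstruct.

    With W = I (assumed here) the transform and its inverse are no-ops,
    so the decoded value is simply delta * round(x / delta).
    """
    return [delta * round(xi / delta) for xi, delta in zip(x, step_sizes)]


def generate_observations(P, N, sigma=2.0, step_sizes=None, seed=0):
    """Draw P i.i.d. Gaussian vectors of dimension N and return their
    transform decoded versions."""
    rng = random.Random(seed)
    if step_sizes is None:
        step_sizes = [1] * N  # Delta_i = 1 for all i, as in the text
    return [
        decode_transform_coded([rng.gauss(0.0, sigma) for _ in range(N)], step_sizes)
        for _ in range(P)
    ]


if __name__ == "__main__":
    N = 4
    S = generate_observations(P=N + 7, N=N)
    for x in S:
        print(x)
```

With unit step sizes every decoded vector has integer coordinates, so the observed set lies on the lattice $\mathbb{Z}^N$.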
Figure 4(a) shows the empirical probability of success when N = 2, 4, 8, 16, 32, 64 and the number of observed vectors P is varied, averaged over 100 realizations. As expected, $p_{\mathrm{succ}}(N, P) = 0$ when the number of vectors P does not exceed the dimensionality of the embedding vector space, i.e., $P \le N$. Then, as soon as $P > N$, $p_{\mathrm{succ}}(N, P)$ grows rapidly to one when just a few additional vectors are visited. More specifically, Figure 4(b) illustrates the number of observed vectors P needed to achieve $p_{\mathrm{succ}}(N, P) > 1 - \epsilon$, where $\epsilon$ was set equal to $10^{-15}$. It is possible to observe that, when N > 2, the number of observed vectors needs to exceed the dimensionality by 6-7 units, and this difference is independent of N, as expected based on the analysis in Section V. Note that the results shown in Figure 4 do not depend on the specific implementation of Algorithm 2.

At the same time, it is interesting to evaluate the complexity when the specific instance of Algorithm 2, namely Algorithm 3, is adopted. Figure 5 shows the total number of recursive calls needed to converge to the solution of (15). Note that when a large enough number P of vectors is observed, the algorithm converges to the correct lattice $\mathcal{L}_x$. Thus, visiting additional vectors does not increase the number of recursive calls, since the base step of the recursion is always met. Figure 5 shows two cases that differ in the way the set of observed vectors is visited, i.e., randomly, or sorted in ascending order of distance from the origin of the vector space. In both cases, the number of recursive calls grows linearly with N.



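This is aligned with the analysis in Section V-B, which shows that the number of recursive steps is upper bounded by $\lceil \log_2 ( |\mathcal{L}(\mathbf{B}^{(0)})| / |\mathcal{L}_x| ) \rceil$. A (loose) bound on the lattice determinant is given by

$$|\mathcal{L}(\mathbf{B}^{(0)})| = |\det(\mathbf{B}^{(0)})| \le \|\mathbf{b}_1^{(0)}\|_2 \, \|\mathbf{b}_2^{(0)}\|_2 \cdots \|\mathbf{b}_N^{(0)}\|_2 \le \|\mathbf{b}_{\max}^{(0)}\|_2^{N}, \tag{44}$$

where the first inequality stems from Hadamard's inequality and $\mathbf{b}_{\max}^{(0)}$ is the column of $\mathbf{B}^{(0)}$ with the largest norm. Therefore,

$$\left\lceil \log_2 \frac{|\mathcal{L}(\mathbf{B}^{(0)})|}{|\mathcal{L}_x|} \right\rceil \le \left\lceil N \log_2 \|\mathbf{b}_{\max}^{(0)}\|_2 - \log_2 |\mathcal{L}_x| \right\rceil. \tag{45}$$

This explains the dependency on N, as well as the fact that sorting the vectors so as to initialize $\mathbf{B}^{(0)}$ with shorter vectors reduces the number of recursive calls.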
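A small worked instance of (44) and (45) may help. The basis below is a hypothetical 2 × 2 example with $|\mathcal{L}_x| = 1$; it is not taken from the experiments.

```latex
\begin{align*}
\mathbf{B}^{(0)} &= \begin{bmatrix} 4 & 2 \\ 2 & 6 \end{bmatrix}, \qquad
|\det(\mathbf{B}^{(0)})| = 20, \\
\|\mathbf{b}_1^{(0)}\|_2 \, \|\mathbf{b}_2^{(0)}\|_2 &= \sqrt{20}\,\sqrt{40} \approx 28.3 \ge 20, \qquad
\|\mathbf{b}_{\max}^{(0)}\|_2^{2} = 40 \ge 28.3, \\
\lceil \log_2 20 \rceil &= 5 \;\le\; \lceil 2 \log_2 \sqrt{40} \rceil = \lceil \log_2 40 \rceil = 6.
\end{align*}
```

Initializing with shorter columns lowers $\|\mathbf{b}_{\max}^{(0)}\|_2$ and therefore tightens the right-hand side of (45).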



VII. Conclusions 

Fig. 5. Total number of recursive calls to recurseTI as a function of the dimensionality of the space N and the strategy adopted to visit the observed vectors.

In this paper we proposed a method that identifies the parameters of a transform coder from a set of P transform decoded vectors embedded in an N-dimensional space. We proved that it is possible to successfully identify the transform and the quantization step sizes when $P > N$, despite the huge number of potential quantization bins, which grows exponentially with N for a target bitrate. In addition, we proved that the probability of failure decreases exponentially to zero when $P - N$ increases. In our experiments we found that an excess of approximately 6-7 observed vectors beyond the dimension N of the space is generally sufficient to ensure successful convergence.

In this paper, we focused on a noiseless scenario, in which we directly observe the output of the decoder. In some cases, though, signals are processed in complex chains in which multiple transform coders are cascaded, thus introducing noise in the observed vectors. Consequently, the observed vectors do not lie exactly on lattice points. Extending the proposed method to this scenario is an interesting avenue for future research.